Skip to main content
Log in

A quadratic lower bound for Rocchio’s similarity-based relevance feedback algorithm with a fixed query updating factor

Journal of Combinatorial Optimization Aims and scope Submit manuscript

Abstract

Rocchio’s similarity-based relevance feedback algorithm, one of the most important query reformation methods in information retrieval, is essentially an adaptive supervised learning algorithm from examples. In practice, Rocchio’s algorithm often uses a fixed query updating factor. When this is the case, we strengthen the linear Ω(n) lower bound obtained by Chen and Zhu (Inf. Retr. 5:61–86, 2002) and prove that Rocchio’s algorithm makes Ω(k(nk)) mistakes in searching for a collection of documents represented by a monotone disjunction of k relevant features over the n-dimensional binary vector space {0,1}n, when the inner product similarity measure is used. A quadratic lower bound is obtained when k is linearly proportional to n. We also prove an O(k(nk)3) upper bound for Rocchio’s algorithm with the inner product similarity measure in searching for such a collection of documents with a constant query updating factor and a zero classification threshold.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

References

  • Angluin D (1987) Queries and concept learning. Mach Learn 2(4):319–432

    Google Scholar 

  • Baeza-Yates R, Ribeiro-Neto B (eds) (1999) Modern information retrieval. Addison-Wesley, Reading

    Google Scholar 

  • Chen Z (2001) Multiplicative adaptive algorithms for user preference retrieval. In: Proceedings of the seventh annual international computing and combinatorics conference. LNCS, vol 2108. Springer, Berlin, pp 540–549

    Google Scholar 

  • Chen Z (2004) Multiplicative adaptive user preference retrieval and its applications to Web search. In: Zhang G, Kandel A, Lin T, Yao Y (eds) Computational Web intelligence: intelligent technology for web applications. World Scientific, Singapore, pp 303–328

    Google Scholar 

  • Chen Z, Fu B (2005) A quadratic lower bound for Rocchio’s similarity-based relevance feedback algorithm. In: Proceedings of the seventh annual international computing and combinatorics conference. LNCS, vol 3595. Springer, Berlin, pp 955–964

    Google Scholar 

  • Chen Z, Fu B (2007) On the complexity of Rocchio’s similarity-based relevance feedback algorithm. J Am Soc Inf Sci Technol 58(10):1392–1400

    Article  Google Scholar 

  • Chen Z, Meng X (2000) Yarrow: a real-time client site meta search learner. In: Proceedings of the AAAI 2000 workshop on artificial intelligence for Web search. AAAI Press, pp 12–17

  • Chen Z, Zhu B (2002) Some formal analysis of Rocchio’s similarity-based relevance feedback algorithm. Inf Retr 5:61–86. (The preliminary version of this paper appeared in Proceedings of the eleventh international symposium on algorithms and computation (ISAAC’00). LNCS, vol 1969, pp 108–119, 2000.)

    Article  MATH  Google Scholar 

  • Chen Z, Meng X, Fowler R, Zhu B (2001) Features: Real-time adaptive feature learning and document learning. J Am Soc Inf Sci 52(8):655–665

    Article  Google Scholar 

  • Chen Z, Meng X, Zhu B, Fowler R (2002) WebSail: From on-line learning to Web search. J Knowl Inf Sci 4:219–227 (The special issue of WISE’00)

    Google Scholar 

  • Frakes W, Baeza-Yates R (eds) (1992) Information retrieval: data structures and algorithms. Prentice-Hall, Englewood Cliffs

    Google Scholar 

  • Ide E (1971a) Interactive search strategies and dynamic file organization in information retrieval. In: Salton G (ed) The smart system—experiments in automatic document processing. Prentice-Hall, Englewood Cliffs, pp 373–393

    Google Scholar 

  • Ide E (1971b) New experiments in relevance feedback. In: Salton G (ed) The smart system—experiments in automatic document processing. Prentice-Hall, Englewood Cliffs, pp 337–354

    Google Scholar 

  • Lewis D (1991) Learning in intelligent information retrieval. In: Proceedings of the eighth international workshop on machine learning, pp 235–239

  • Littlestone N (1988) Learning quickly when irrelevant attributes abound: a new linear-threshold algorithm. Mach Learn 2:285–318

    Google Scholar 

  • Raghavan V, Wong S (1986) A critical analysis of the vector space model for information retrieval. J Am Soc Inf Sci 37(5):279–287

    Google Scholar 

  • Rocchio J (1971) Relevance feedback in information retrieval. In: Salton G (ed) The smart retrieval system—experiments in automatic document processing. Prentice-Hall, Englewood Cliffs, pp 313–323

    Google Scholar 

  • Salton G (eds) (1989) Automatic text processing: the transformation, analysis, and retrieval of information by computer. Addison-Wesley, Reading

    Google Scholar 

  • Salton G, Buckley C (1990) Improving retrieval performance by relevance feedback. J Am Soc Inf Sci 41(4):288–297

    Article  Google Scholar 

  • Salton G, Wong S, Yang C (1975) A vector space model for automatic indexing. Commun ACM 18(11):613–620

    Article  MATH  Google Scholar 

  • van Rijsbergen KCJ (1979) Information retrieval. Butterworths, London

    Google Scholar 

  • Wong S, Yao Y, Bollmann P (1988) Linear structures in information retrieval. In: Proceedings of the 1988 ACM-SIGIR conference on information retrieval, pp 219–232

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Zhixiang Chen.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Chen, Z., Fu, B. & Abraham, J. A quadratic lower bound for Rocchio’s similarity-based relevance feedback algorithm with a fixed query updating factor. J Comb Optim 19, 134–157 (2010). https://doi.org/10.1007/s10878-008-9169-6

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10878-008-9169-6

Keywords

Navigation