A Hybrid Relevance-Feedback Approach to Text Retrieval

  • Zhao Xu
  • Xiaowei Xu
  • Kai Yu
  • Volker Tresp
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2633)

Abstract

Relevance feedback (RF) is an effective query modification approach that improves the performance of information retrieval (IR) by interactively asking a user whether each document in a set is relevant to a given query concept. Conventional RF algorithms either converge slowly or cost the user additional effort in reading irrelevant documents. This paper surveys several RF algorithms and introduces a novel hybrid RF approach using a support vector machine (HRFSVM), which actively selects both the uncertain documents and the most relevant ones for user feedback. It can efficiently rank documents in a natural way for user browsing. We conduct experiments on the Reuters-21578 dataset and track precision as a function of feedback iterations. The experimental results show that HRFSVM significantly outperforms two other RF algorithms.
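
The hybrid selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact selection rule, so everything below is an assumption: documents are scored by a signed classifier value (for an SVM, the decision value), "uncertain" is taken to mean a score closest to zero, and "most relevant" is taken to mean the largest score. The names `select_for_feedback` and `precision_at_k`, the batch sizes, and the example scores are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Set


def select_for_feedback(
    scores: Dict[str, float],
    labeled: Set[str],
    n_uncertain: int,
    n_relevant: int,
) -> List[str]:
    """Pick unlabeled documents to show the user (hypothetical helper).

    `scores` maps a document id to a signed classifier score, assumed here
    to be an SVM decision value: positive means "looks relevant", and
    values near zero mean the classifier is unsure.
    """
    candidates = [d for d in scores if d not in labeled]

    # Uncertain documents: smallest absolute score, i.e. closest to the boundary.
    uncertain = sorted(candidates, key=lambda d: (abs(scores[d]), d))[:n_uncertain]

    # Most relevant documents: largest score among those not already chosen.
    remaining = [d for d in candidates if d not in uncertain]
    relevant = sorted(remaining, key=lambda d: (-scores[d], d))[:n_relevant]

    # Relevant documents come first so the batch is pleasant to browse.
    return relevant + uncertain


def precision_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are truly relevant."""
    top = ranked[:k]
    return sum(1 for d in top if d in relevant) / len(top) if top else 0.0


# Made-up scores for six documents; "d5" has already been judged by the user.
scores = {"d1": 2.1, "d2": 0.05, "d3": -1.7, "d4": -0.1, "d5": 1.3, "d6": 0.9}
batch = select_for_feedback(scores, labeled={"d5"}, n_uncertain=2, n_relevant=2)

# Rank all documents by score and measure precision against assumed ground truth.
ranking = sorted(scores, key=lambda d: -scores[d])
p_at_3 = precision_at_k(ranking, relevant={"d1", "d5", "d2"}, k=3)
```

In a full feedback loop, the user's judgments on `batch` would be added to the training set, the classifier retrained, and precision recorded after each iteration. Mixing the two kinds of documents is one plausible way to balance learning speed against the effort of reading irrelevant documents, which is the trade-off the abstract describes.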

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Zhao Xu (1)
  • Xiaowei Xu (2)
  • Kai Yu (3)
  • Volker Tresp (4)
  1. Tsinghua University, Beijing, P.R. China
  2. University of Arkansas at Little Rock, Little Rock, USA
  3. University of Munich, Munich, Germany
  4. Siemens AG, Corporate Technology, Munich, Germany