Information Retrieval, Volume 10, Issue 3, pp 257–274

Linear feature-based models for information retrieval

  • Donald Metzler
  • W. Bruce Croft

Abstract

A number of linear, feature-based models have recently been proposed by the information retrieval community. Although each model is presented differently, they all share a common underlying framework. In this paper, we explore and discuss the theoretical issues of this framework, including a novel look at the parameter space. We then detail supervised training algorithms that directly maximize the evaluation metric under consideration, such as mean average precision. We present results showing that training models in this way can lead to significantly better test set performance than training methods that do not directly maximize the metric. Finally, we show that, with the correct choice of features, linear feature-based models can consistently and significantly outperform current state-of-the-art retrieval models.
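As a rough illustration of what directly maximizing the evaluation metric can mean for a linear feature-based model, the following Python sketch scores documents with a weighted sum of features and picks the weights by exhaustive search over mean average precision on a toy training set. It is not the training procedure from the paper: the two-feature setup, the constraint that the weights sum to 1, the grid search, the toy data, and all function names are assumptions made for this example. Average precision is computed under the further assumption that every relevant document appears in the ranked list.

```python
from typing import List, Sequence, Tuple

# One document: (feature vector, binary relevance label). Hypothetical layout.
Doc = Tuple[Sequence[float], int]


def score(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear scoring function: dot product of weights and features."""
    return sum(w * f for w, f in zip(weights, features))


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Average precision of one ranked list of binary relevance labels.

    Assumes all relevant documents for the query are in the list.
    """
    hits, total = 0, 0.0
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(weights: Sequence[float],
                           queries: List[List[Doc]]) -> float:
    """Rank each query's documents by linear score, then average the APs."""
    aps = []
    for docs in queries:
        ranked = sorted(docs, key=lambda d: score(weights, d[0]), reverse=True)
        aps.append(average_precision([label for _, label in ranked]))
    return sum(aps) / len(aps)


def grid_search(queries: List[List[Doc]], steps: int = 10):
    """Try two-feature weights (w, 1 - w) on a grid; keep the best MAP."""
    best_w, best_map = None, -1.0
    for i in range(steps + 1):
        w = (i / steps, 1 - i / steps)
        m = mean_average_precision(w, queries)
        if m > best_map:
            best_w, best_map = w, m
    return best_w, best_map


# Toy training data (invented): two queries, two features per document.
toy_queries: List[List[Doc]] = [
    [((1.0, 0.0), 0), ((0.0, 1.0), 1), ((0.5, 0.2), 0)],
    [((0.9, 0.1), 0), ((0.2, 0.8), 1)],
]

if __name__ == "__main__":
    print(grid_search(toy_queries))
```

Because mean average precision depends on the weights only through the induced ranking, it is piecewise constant in the weights, which is why this sketch searches the weight space directly instead of following a gradient. Rescaling all weights by a positive constant leaves every ranking unchanged, so restricting the search to weights that sum to 1 loses nothing when the weights are non-negative.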

Keywords

Retrieval models, Linear models, Features, Direct maximization

Copyright information

© Springer Science+Business Media, LLC 2006

Authors and Affiliations

  • Donald Metzler (1)
  • W. Bruce Croft (1)
  1. University of Massachusetts, Amherst, USA
