Regression Rank: Learning to Meet the Opportunity of Descriptive Queries

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 5478)

Abstract

We present a new learning-to-rank framework for estimating context-sensitive term weights without the use of feedback. Specifically, knowledge of effective term weights on past queries is used to estimate term weights for new queries. This generalization is achieved by introducing secondary features correlated with term weights and applying regression to predict term weights from those features. To improve support for more focused retrieval such as question answering, we conduct document retrieval experiments with TREC description queries on three document collections. Results show significantly improved retrieval accuracy.
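
The core idea in the abstract, regressing term weights on secondary features learned from past queries, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two features (a normalized rarity score and a noun indicator), the toy training values, the plain least-squares fit by gradient descent, and the function names `fit_linear` and `predict_term_weights`.

```python
from typing import List, Sequence, Tuple


def fit_linear(X: Sequence[Sequence[float]], y: Sequence[float],
               lr: float = 0.5, epochs: int = 2000) -> Tuple[List[float], float]:
    """Fit y ~ w.x + b by batch gradient descent on squared error."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sum(wj * xj for wj, xj in zip(w, xi)) + b - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_term_weights(features: Sequence[Sequence[float]],
                         w: Sequence[float], b: float) -> List[float]:
    """Predict one weight per query term, clip at zero, normalize to sum to 1."""
    raw = [max(0.0, sum(wj * xj for wj, xj in zip(w, x)) + b) for x in features]
    total = sum(raw)
    if total == 0.0:
        # Fall back to uniform weights if every prediction was clipped to zero.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]


# Toy training data (made up for illustration): one row per term of a past query.
# Hypothetical feature columns: [normalized rarity score, 1.0 if noun else 0.0]
train_X = [[0.9, 1.0], [0.1, 0.0], [0.5, 1.0], [0.2, 0.0], [0.7, 0.0], [0.3, 1.0]]
# Hypothetical "effective" term weights observed on those past queries.
train_y = [0.85, 0.15, 0.65, 0.20, 0.45, 0.55]

w, b = fit_linear(train_X, train_y)

# Features for the terms of a new, unseen query (also made up).
new_query_features = [[0.8, 1.0], [0.1, 0.0], [0.4, 0.0]]
weights = predict_term_weights(new_query_features, w, b)
print([round(v, 3) for v in weights])
```

The predicted weights could then be attached to the query terms in a weighted retrieval model. The sketch needs no relevance feedback at query time, since only features of the query terms themselves are used, which matches the setting the abstract describes.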

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lease, M., Allan, J., Croft, W.B. (2009). Regression Rank: Learning to Meet the Opportunity of Descriptive Queries. In: Boughanem, M., Berrut, C., Mothe, J., Soule-Dupuy, C. (eds) Advances in Information Retrieval. ECIR 2009. Lecture Notes in Computer Science, vol 5478. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00958-7_11

  • DOI: https://doi.org/10.1007/978-3-642-00958-7_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00957-0

  • Online ISBN: 978-3-642-00958-7

  • eBook Packages: Computer Science, Computer Science (R0)