Part of the book series: The Information Retrieval Series (INRE, volume 27)

Abstract

This chapter demonstrates how feature-based models can be extended and used for query expansion through a technique known as latent concept expansion (LCE). The approach has three key benefits: it goes beyond the bag-of-words assumption, it can employ arbitrary features during the query expansion process, and it can expand queries with a variety of concept types beyond unigrams. In addition to the basic LCE model, the chapter describes several extensions, including generalized LCE and LCE using hierarchical MRFs that encode document structure during the expansion process.

Author information

Corresponding author

Correspondence to Donald Metzler.

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Metzler, D. (2011). Feature-Based Query Expansion. In: A Feature-Centric View of Information Retrieval. The Information Retrieval Series, vol 27. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22898-8_4

  • DOI: https://doi.org/10.1007/978-3-642-22898-8_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22897-1

  • Online ISBN: 978-3-642-22898-8

  • eBook Packages: Computer Science, Computer Science (R0)
